Product of Gaussians for speech recognition

Authors

  • Mark J. F. Gales
  • S. S. Airey
Abstract

Recently there has been interest in the use of classifiers based on the product of experts (PoE) framework. PoEs offer an alternative to the standard mixture of experts (MoE) framework. A PoE may be viewed as examining the intersection of a series of experts, rather than the union as in the MoE framework. This paper presents a particular implementation of PoEs, the normalised product of Gaussians (PoG). Here each expert is a Gaussian mixture model. In this work, the PoG model is presented within a hidden Markov model framework. This allows the classification of variable-length data, such as speech data. Training and initialisation procedures are described for this PoG system. The relationship of the PoG system with other schemes, including covariance modelling schemes, is also discussed. In addition, the scheme is shown to be related to a standard speech recognition approach, multiple stream systems. The performance of the PoG system is examined on an automatic speech recognition task, Switchboard, and compared to standard Gaussian mixture systems and multiple stream systems.
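The intersection-versus-union contrast can be illustrated with a minimal sketch. It assumes the simplest possible case, in which each expert is a single one-dimensional Gaussian rather than the Gaussian mixture models used in the paper, and it ignores the hidden Markov model framework entirely. The function names are hypothetical and chosen for illustration only. The sketch relies on the standard result that a normalised product of Gaussians is itself a Gaussian whose precision is the sum of the expert precisions and whose mean is the precision-weighted average of the expert means.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def product_of_gaussians(experts):
    """Return (mean, var) of the normalised product of 1-D Gaussian experts.

    `experts` is a list of (mean, var) pairs. Precisions add, and the mean
    is the precision-weighted average of the expert means.
    """
    precision = sum(1.0 / var for _, var in experts)
    mean = sum(m / var for m, var in experts) / precision
    return mean, 1.0 / precision


def mixture_pdf(x, experts):
    """Equal-weight mixture of the same experts, shown for contrast."""
    return sum(gaussian_pdf(x, m, v) for m, v in experts) / len(experts)


# Two illustrative experts (assumed values, not taken from the paper).
experts = [(0.0, 1.0), (2.0, 1.0)]
pog_mean, pog_var = product_of_gaussians(experts)

# The product concentrates where both experts agree (the "intersection"),
# so its variance is smaller than that of either expert.
print("PoG mean:", pog_mean, "PoG variance:", pog_var)

# The mixture keeps mass wherever any expert has mass (the "union").
print("Mixture density at x=0:", mixture_pdf(0.0, experts))
print("PoG density at x=0:", gaussian_pdf(0.0, pog_mean, pog_var))
```

With the two experts above, the product is narrower than either expert, while the equal-weight mixture keeps more density near each expert's own mean.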

Similar resources

Product of Gaussians as a distributed representation for speech recognition

Distributed representations allow the effective number of Gaussian components in a mixture model, or state of an HMM, to be increased without dramatically increasing the number of model parameters. Various forms of distributed representation have previously been investigated. In this work it is shown that the product of experts (PoE) framework may be viewed as a distributed representation when the...

Response Time Reduction of Speech Recognizers Using Single Gaussians

In this paper, we propose an algorithm for reducing the response time of HMM-based speech recognizers. The algorithm reduces response time by using single Gaussians to select promising HMM states. During recognition, HMM state likelihoods are first evaluated by the corresponding single Gaussians, and then likelihoods by original full Gaussians are computed...

A comparative study of Gaussian selection methods in large vocabulary continuous speech recognition

Gaussian mixture models are the most popular probability density used in automatic speech recognition. During decoding, many Gaussians are often evaluated, but only a small number of them contribute significantly to the probability. Several promising methods to select relevant Gaussians are known. These methods have different properties in terms of required memory, overhead and quality of selecte...

Reduced Gaussian mixture models in a large vocabulary continuous speech recognizer

Large vocabulary continuous speech recognition (LVCSR) systems usually employ several tens of thousands of Gaussian mixture components for an accurate statistical representation of naturally spoken human speech. For applications that cannot afford the computationally expensive evaluation of numerous Gaussians during recognition time, it is an important question whether the number of Gaussians can ...

Journal:
  • Computer Speech & Language

Volume 20

Publication date: 2006